1. Introduction, Pilot Projects

Development of FORM [1] has taken place mainly via what I call "pilot projects". These are
science projects that are very demanding on algebraic systems and whose efficient solution requires
many new features. If FORM were a pure computer science undertaking, it would lack this
input and the result would be a far less useful product. The main pilot projects are or have been:



• Three loop massless QCD (fixed moments). 

• The four loop beta function. 

• Three loop massless QCD in deep inelastic scattering (all moments). 

• The Karlsruhe projects. 

• Multiple Zeta Values. 

• Automatic One Loop calculations. 

The first project started with the Mincer [2, 3] program and the need for extreme speed. This
led to special commands and functions. This project ran from 1990 to 1996.

The four loop beta function [4] led to the development of the color package [5]. This required
extensive treatment of antisymmetric functions and also some pattern recognition in the form of
finding loops in index contractions. This project ran from 1996 to 1997.

The third project was more than just an extension of the first. It needed completely new
techniques. This led to facilities for formal summation in the form of the Summer package [6] and
large scale storage for tables. Both needed several new features in FORM. The project ran roughly
from 1996 to 2005.

The need for computer power in Karlsruhe led to the development of ParFORM [7, 8, 9].
This was mainly used for the 4-loop programs of Pavel Baikov [10]. The concepts of the
ParFORM program were largely taken over in TFORM, so ParFORM also stood at the cradle of that
program. Another related development is the Laporta-style [11] program that was developed in
Karlsruhe and led to communication channels between FORM and other programs. The ParFORM
project was recently declared completed and ran from 1995 to 2010.

The Multiple Zeta Value [12] calculations form a more mathematical project. They created the
need to solve very large systems of equations and have been a major test case of TFORM [13]. This
has led to completely new commands and new features in TFORM. This project ran from 1997 to
2010.

Since 2005 more attention has been paid to automated one loop calculations. These pose new
requirements on FORM in the manipulation of outputs and results. One can think here
of factorization, code simplification and sophisticated print statements. This is the currently running project.

As part of this ongoing development it is of course important that developers have access to 
hardware that is really up-to-date and preferably more advanced than what the average user has at 
the same moment. This way the system will be ready for efficient use by the time that this hardware 
becomes more common. This is most noticeable with the parallel developments. 
Example: TFORM [13] was developed on a machine with 4 cores (Nikhef) in the days that
everybody still had one core (or very rarely two). The past few years TFORM has been running
mostly on eight cores (Karlsruhe and Zeuthen), and very recently Nikhef got a special computer
for TFORM with 24 cores and 128 Gbytes of memory. By tuning TFORM more and more to such
large numbers of cores, TFORM will be ready by the time everybody has access to such machines.
At the moment a good TFORM computer, from the viewpoint of the user, would have 8 cores, a
large memory (at least 32 Gbytes) and a very large and fast disk. It should also run Linux (see footnote 1).

In this talk we will briefly discuss a number of these pilot projects to see what they needed in
(T)FORM and how they gave shape to it. The final project we discuss concerns the automated one
loop calculations and their needs. This gives more insight into the future development of (T)FORM.
Finally we will have a look at the most recent development: FORM is now open source and there
is an internet forum for publicly discussing matters relating to FORM.

2. Mincer 

When computing massless propagator graphs one can use integration by parts identities to
reduce all one, two and three loop integrals to a set of three master integrals. These master integrals
are known to sufficient powers in $\epsilon = (4-D)/2$ for the purpose of three loop (and even four loop)
calculations.





To calculate higher Mellin moments of structure functions one has to consider scattering
diagrams and take N derivatives with respect to the parton momentum P, after which P is set to zero.
These higher derivatives cause many tensorial and combinatorial problems, and two functions
(distrib_ and dd_) had to be invented to deal with this properly. The strong point of dd_ in particular
is that it gets the combinatorics right and terms are not generated multiple times.

Footnote 1: On non-UNIX operating systems usually one or more features are missing. For instance, the GMP does not work
on Apple computers and Windows cannot handle the POSIX threads of TFORM.
Example:

    Vector Q,p1,p2,p3;
    Indices i1,...,i10;
    Tensor T;
    L   F1 = <Q(i1)>*...*<Q(i10)>;
    L   F2 = Q.p1^3*Q.p2^3*Q.p3^4;
    ToTensor,Q,T;
    Print;
    .sort

       F1 =
          T(i1,i2,i3,i4,i5,i6,i7,i8,i9,i10);

       F2 =
          T(p1,p1,p1,p2,p2,p2,p3,p3,p3,p3);

    id  T(?a) = dd_(?a);
    .sort

    Time =       0.00 sec    Generated terms =        945
                  F1         Terms in output =        945
                             Bytes used      =      33348

    Time =       0.00 sec    Generated terms =          9
                  F2         Terms in output =          9
                             Bytes used      =        572

    if ( expression(F1) )
        Multiply p1(i1)*p1(i2)*p1(i3)
                *p2(i4)*p2(i5)*p2(i6)
                *p3(i7)*p3(i8)*p3(i9)*p3(i10);
    .end

    Time =       0.00 sec    Generated terms =        945
                  F1         Terms in output =          9
                             Bytes used      =        572

    Time =       0.00 sec    Generated terms =          9
                  F2         Terms in output =          9
                             Bytes used      =        572
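The term counts in these statistics can be checked independently. The following is a minimal Python sketch, not FORM code and not the algorithm used inside dd_: it simply enumerates all ways of pairing ten indices (the 945 terms of F1) and then counts how many distinct products of dot products remain once the indices are identified with p1, p2 and p3 as in the Multiply statement (the 9 terms of F2). The helper names `pairings` and `as_dot_products` are illustrative.

```python
from collections import Counter

def pairings(items):
    """Yield every way to split `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

# F1: ten distinct indices, so every pairing is a distinct product of deltas.
all_pairings = list(pairings(list(range(1, 11))))

# F2: indices 1-3 carry p1, 4-6 carry p2, 7-10 carry p3.
vector_of = {i: "p1" for i in (1, 2, 3)}
vector_of.update({i: "p2" for i in (4, 5, 6)})
vector_of.update({i: "p3" for i in (7, 8, 9, 10)})

def as_dot_products(pairing):
    """Rewrite a pairing of indices as a sorted tuple of dot products."""
    return tuple(sorted(tuple(sorted((vector_of[a], vector_of[b])))
                        for a, b in pairing))

# Each key is one distinct term; its value is how often brute force hits it.
multiplicity = Counter(as_dot_products(p) for p in all_pairings)

print(len(all_pairings), len(multiplicity))  # 945 9
```

Brute force generates 945 terms to arrive at 9; generating each of the 9 terms only once, with the right coefficient, is what the text credits dd_ with.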



A function like dd_ is also useful for one loop integration when one replaces

    id Q(i1?)*Q(i2?) = d_(i1,i2)*Q.Q/D;
    id Q(i1?)*Q(i2?)*Q(i3?)*Q(i4?) =
          dd_(i1,i2,i3,i4)*Q.Q^2/D/(D+2);
    id Q(i1?)*Q(i2?)*Q(i3?)*Q(i4?)*Q(i5?)*Q(i6?) =
          dd_(i1,i2,i3,i4,i5,i6)*Q.Q^3/D/(D+2)/(D+4);
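The denominators D, D(D+2) and D(D+2)(D+4) follow from contracting both sides with Kronecker deltas: the full trace of the symmetric tensor must equal the denominator so that the contraction returns the corresponding power of Q.Q. The Python sketch below checks this by brute force for small integer D. It assumes that dd_ stands for the sum over all pairings of Kronecker deltas; the function names are illustrative.

```python
from itertools import product

def pairings(items):
    """Yield every way to split `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        for tail in pairings(rest[:k] + rest[k + 1:]):
            yield [(first, partner)] + tail

def dd(*idx):
    """Sum over all pairings of products of Kronecker deltas."""
    return sum(all(a == b for a, b in p) for p in pairings(list(idx)))

def full_trace(rank, D):
    """Contract dd(i1,...,i_rank) with delta(i1,i2)*delta(i3,i4)*... in D dimensions."""
    half = rank // 2
    return sum(dd(*[i for i in combo for _ in (0, 1)])
               for combo in product(range(D), repeat=half))

for D in (2, 3, 4):
    print(D, full_trace(2, D), full_trace(4, D), full_trace(6, D))
```

For each D the three traces come out as D, D(D+2) and D(D+2)(D+4), matching the denominators in the id statements.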

Many of the other features that were introduced during the development and use of the Mincer [2, 3]
package are considered completely standard by now.

3. Ensum 

N-dependent moments are not computed by writing the derivatives out as sums
and then working one's way through the Mincer algorithms, introducing more and more sums when
the integration by parts identities are applied. This has been tried, but it has given results only in
the simplest two loop cases. In general this is too difficult.

The method that is used instead is to derive recursion relations in the parameter N [14] and then either
sum the recursion or, when it is a higher order difference equation, solve it by brute force.
This involves solving large sets of linear equations.



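[Figure: a recursion relation in N written in terms of diagrams; the diagrams and their coefficients are not recoverable from the extracted text.]

An example of an integral to be solved by a difference equation is

[Equation: a three loop integral over the loop momenta $p_1$, $p_2$, $p_3$; the integrand is not recoverable from the extracted text.]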
Generically the equation looks like

$$a_0(N)F(N) + a_1(N)F(N-1) + \cdots + a_m(N)F(N-m) = G(N)$$

It is solved by making an ansatz containing many functions, substituting it and solving the resulting 
system of linear equations. One needs m fixed values for the boundary conditions. 
For this diagram the equation is a third order equation with: 

$$a_0(N) = N(N-2\epsilon)(N+2\epsilon)(N+1-2\epsilon)(3N+3+2\epsilon)/2$$
$$a_3(N) = (N-1)(N-2)(3N+2\epsilon)(N-1+3\epsilon)(N-1+6\epsilon)/2$$

The source term $G(N)$ is built from $S_{-2}(N)$, $S_2(N)$ and terms proportional to $1-(-1)^N$. The coefficients $a_1(N)$ and $a_2(N)$ are the factors $(N-2\epsilon)$ and $(N-1)$, respectively, times a polynomial in $N$ and $\epsilon$. The exponents and several coefficients of $G(N)$, $a_1(N)$ and $a_2(N)$ are illegible in this copy.

In the case of our diagram the answer is rather simple (this is exceptional): 

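$$F(N) = \theta(N) \cdots \bigl( 20\zeta_5 + 12 S_{-3,-2}(N+1) + 4 S_{-3,2}(N+1) + 8 S_{-2}(N+1)\zeta_3 + 4 S_{-2,-3}(N+1) - 4 S_{-2,3}(N+1) + 8 S_{2}(N+1)\zeta_3 + 4 S_{2,-3}(N+1) - 4 S_{2,3}(N+1) + 12 S_{3,-2}(N+1) + 4 S_{3,2}(N+1) \bigr)$$

where the dots stand for N-dependent prefactors that are illegible in this copy.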

Even then this turned out to be too demanding for the computers we had, and we had to
store all intermediately obtained integrals in a large set of tables.

The problem with tables is that they have to be compiled at the start of the program. Even at a
few Mbytes/sec, compiling 3 Gbytes of tables at the start of each program is not pleasant when you are
developing new code.



Hence a special database system for tables was designed: the tablebase [15]. This keeps the
tables in a special file (gzipped) and only tells FORM which elements there are. Then, when
needed, only those elements that are actually used are compiled and applied. This turns out to
work very well.

Together, this made it possible to compute the anomalous dimensions and coefficient functions
of three loop DIS in QCD [16, 17, 18].



4. Multiple Zeta Values 

Harmonic sums [19] are defined by [20, 21]:

$$S_m(N) = \sum_{i=1}^{N} \frac{\mathrm{sign}(m)^i}{i^{|m|}}$$
$$S_{m,m_2,\ldots,m_k}(N) = \sum_{i=1}^{N} \frac{\mathrm{sign}(m)^i}{i^{|m|}}\, S_{m_2,\ldots,m_k}(i)$$

This is a notation that is also suitable for computers. Definitions vary here:
some people use $i-1$ for the argument of the $S$ in the recursive formula.
Those sums we call Z-sums.
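The recursive definition translates directly into code. The following Python sketch assumes the standard definition of ref. [20] (a negative index contributes an alternating sign) and uses exact rational arithmetic. The function name `S` and the `shift` parameter are illustrative: `shift=0` gives harmonic sums, and `shift=1` uses $i-1$ as the inner argument, which gives the Z-sums mentioned above.

```python
from fractions import Fraction

def S(indices, N, shift=0):
    """Nested sum with index list `indices` evaluated at integer N.

    shift=0: harmonic sums (inner argument i).
    shift=1: Z-sums (inner argument i-1).
    """
    if not indices:
        return Fraction(1)
    m, rest = indices[0], indices[1:]
    sign = -1 if m < 0 else 1
    return sum(Fraction(sign**i, i**abs(m)) * S(rest, i - shift, shift)
               for i in range(1, N + 1))

print(S([1], 3))              # 11/6
print(S([-1], 2))             # -1/2
print(S([1, 1], 2))           # 7/4
print(S([1, 1], 2, shift=1))  # 1/2
```

The weight of a sum is the sum of the absolute values of its indices, so for example `S([-1, 3, -2], N)` has weight 6 and depth 3.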

The harmonic polylogarithms [22] are defined by:

$$H(0;x) = \ln x$$
$$H(1;x) = \int_0^x \frac{dx'}{1-x'} = -\ln(1-x)$$
$$H(-1;x) = \int_0^x \frac{dx'}{1+x'} = \ln(1+x)$$

and the functions $f(0;x) = \frac{1}{x}$, $f(1;x) = \frac{1}{1-x}$, $f(-1;x) = \frac{1}{1+x}$.
If $\vec{a}_w$ is an array with $w$ elements, all with value $a$, then:

$$H(\vec{0}_w;x) = \frac{1}{w!}\ln^w x$$
$$H(a,\vec{m}_w;x) = \int_0^x dx'\, f(a;x')\, H(\vec{m}_w;x')$$

The weight is the number of indices in integral notation. These indices are either one or zero 
or minus one. The depth is the number of indices in sum notation in which there can be all integer 
numbers with the exception of zero. The sum of the absolute values of the indices in sum notation 
is equal to the weight. Harmonic sums are the Mellin transforms of the harmonic polylogarithms. 

In the ensum project we needed these objects only to weight 6 and weight 5 respectively. What 
was important was that we needed the harmonic polylogarithms in one (or the sums in infinity). 
There are many relations between them and because of that there are only very few that are linearly 
independent. This is very relevant as seen in the next example: 

    #define SIZE "6"
    #include- harmpol.h
    Off statistics;
    .global
    Local F = S(R(-1,3,-2),N);
    #call invmel(S,N,H,x)
    Print +f +s;
    .end

           + H(R(-1,-3,0),x)*[1-x]^-1
           - 1/2*sign_(N)*H(R(1,0,0),x)*[1+x]^-1*z2
           + 1/2*H(R(-1,0,0),x)*[1-x]^-1*z2
           + 3/2*H(R(-1,0),x)*[1-x]^-1*z3
           + 21/20*H(R(-1),x)*[1-x]^-1*z2^2
           - 51/32*[1-x]^-1*z5
           + 3/4*[1-x]^-1*z2*z3
           - 7/2*s6
           + 51/32*z5*ln2
           - 33/64*z3^2
           + 9/4*z2*z3*ln2
           + 121/840*z2^3
           - 51/32*sign_(N)*[1+x]^-1*z5
           + 3/4*sign_(N)*[1+x]^-1*z2*z3
          ;

      0.28 sec out of 0.33 sec

The above is a relatively short answer (14 terms). But this takes into account that there are 
many relations between the harmonic sums in infinity (or the hpl's in one). If we don't use these 
relations we have the result 

           + H(R(-1,-3,0),x)*[1-x]^-1
           - sign_(N)*H(R(1,0,0),x)*Z(-2)*[1+x]^-1
           + H(R(-1,0,0),x)*Z(-2)*[1-x]^-1
           + 2*H(R(-1,0),x)*Z(-3)*[1-x]^-1
           + 3*H(R(-1),x)*Z(-4)*[1-x]^-1
           - sign_(N)*Z(-2,-3)*[1+x]^-1
           + 6*Z(-4,-1,1)
           + 3*Z(-4,1,-1)
           + 5*Z(-3,-2,1)
           + 4*Z(-3,-1,2)
           + Z(-3,1,-2)
           + 3*Z(-3,2,-1)
           - Z(-2,-3)*[1-x]^-1
           - 2*Z(-2,-3,-1)
           + 5*Z(-2,-3,1)
           + Z(-2,-2,-2)
           + 3*Z(-2,-2,2)
           + 2*Z(-2,-1,-3)
           + 2*Z(-2,-1,3)
           + Z(-2,2,-2)
           + 3*Z(-2,3,-1)
           + 3*Z(-1,-4,1)
           + 2*Z(-1,-3,2)
           + Z(-1,-2,-3)
           + Z(-1,-2,3)
           + Z(-1,3,-2)
           + 3*Z(-1,4,-1)

Now we have 27 terms!
It is an interesting mathematical problem to see how many of these hpl's in one exist for a
given weight. The only two ways known thus far to compute this are

• Determine a given object numerically to a very large number of digits. Guess a basis and 
evaluate the elements of this basis to the same accuracy. Then use a program like PSLQ 
or the LLL algorithm to determine an integer relation between them. This may or may not 
succeed, depending on the accuracy used. 

• Determine for a given weight all relations between the objects and solve this set. This can be 
done either as a matrix problem or formally with a computer algebra system. The power of 
the system determines how far one can go. 



Although there exist formulas [23, 24] for the number of basis elements for given weight
and depth, they have not been proven and sometimes surprises still show up (as happened in this
research). The case of weight 27 was very special (a new phenomenon was expected to occur there)
and was finally solved (modulo a 31-bit prime number) recently in a job of 85 days on an 8-core Xeon
computer at DESY Zeuthen [25]:



171258.46 sec + 55845418.93 sec: 56016677.39 sec out of 7345664.84 sec 

As one can imagine, such calculations require optimal use of the hardware and several new 
features had to be added to (T)FORM. The effective use of the cores left only less than 5% idle 
time during the whole job. This included occasional traffic jams at the single disk being used in 8 
parallel disk sorts. 
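The numbers in the timing line can be related to the 85 days and the 5% idle time with a few lines of arithmetic. The Python sketch below assumes that the first two numbers are the CPU times of the master and of the workers, that the third is their sum, and that the last is the elapsed (wall clock) time; that reading of the line is an assumption, and the variable names are illustrative.

```python
# Values copied from the timing line of the weight 27 job.
master_cpu = 171258.46     # assumed: CPU seconds of the master
worker_cpu = 55845418.93   # assumed: CPU seconds of all workers together
total_cpu = 56016677.39    # reported sum
wall_clock = 7345664.84    # elapsed seconds
cores = 8

days = wall_clock / 86400.0
busy_fraction = total_cpu / (cores * wall_clock)
idle_fraction = 1.0 - busy_fraction

print(f"{days:.1f} days, {100 * idle_fraction:.1f}% idle")
```

With these inputs the job length comes out at about 85 days and the idle fraction at a little under 5%, consistent with the statements in the text.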

Some of the new features [26] are

• The family of transform statements. 

• The InParallel option for TFORM to process large numbers of small expressions in parallel. 

• The use of the bracket index to divide the tasks over the workers. 

And then there was the debugging of lots of features that had been used only rarely and hence were 
far from perfect. 

5. Automated One-Loop Calculations 

Originally FORM development started just for this problem. The name of the complete project
was ESP (Experiment Simulation Program) and at the core of it a powerful symbolic manipulator
was needed. The idea was to use an amplitude approach based on an advanced (at that moment)
spinor library named Spider (see footnote 2), which had excellent numerical properties.

Hence in 1984 FORM development was started, but it took, of course, much longer than
estimated, and by the time it became operational (1989) the Grace [29] system was well under
development. Also I got sidetracked into three loop QCD to show off the power of FORM.

As a result the ESP system was never completed and it was judged wiser to join the Grace
effort to reach the goal of automated one loop calculations.

Footnote 2: Of the Spider approach only internal notes exist.
But the project also resulted in the FF program by van Oldenborgh [28].

One of the main problems in automated one loop calculations is organization. If the power
of (T)FORM were not sufficient, no other program would be able to deal with it. The main
problem is the presentation of the output. The method used in the Grace system produces lengthy
FORTRAN outputs and this in turn presents the FORTRAN compiler with insurmountable
complications. Hence the natural approach seems to be to try to make the outputs shorter by what is
called code simplification. An example would be that



    x1*x3+x1*x4+x2*x3+x2*x4+x5

is replaced by

    z1 = x1+x2
    z2 = x3+x4
    F  = z1*z2+x5

in which we save three multiplications and one addition.
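The saving can be verified mechanically. The Python sketch below counts binary additions and multiplications in both versions by inspecting the expression strings, and checks numerically that the two forms agree. The counting helper `count_ops` is illustrative and only meant for flat expressions like these.

```python
import re

def count_ops(expr):
    """Return (additions, multiplications) of a flat expression string."""
    expr = expr.replace(" ", "")
    # A + or - is binary when it follows an operand or a closing bracket.
    adds = len(re.findall(r"(?<=[\w)])[+-]", expr))
    mults = expr.count("*")
    return adds, mults

original = "x1*x3+x1*x4+x2*x3+x2*x4+x5"
simplified = ["x1+x2", "x3+x4", "z1*z2+x5"]  # right-hand sides of z1, z2, F

adds_before, mults_before = count_ops(original)
adds_after = sum(count_ops(rhs)[0] for rhs in simplified)
mults_after = sum(count_ops(rhs)[1] for rhs in simplified)

def f_original(x1, x2, x3, x4, x5):
    return x1*x3 + x1*x4 + x2*x3 + x2*x4 + x5

def f_simplified(x1, x2, x3, x4, x5):
    z1 = x1 + x2
    z2 = x3 + x4
    return z1*z2 + x5

print(adds_before - adds_after, mults_before - mults_after)  # 1 3
```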

Let us go to the current test reaction $e^+e^- \to \gamma e^+e^-$. There are two ways to attack this problem.
The first way is to calculate the matrix element squared. This has been implemented [30] and a
certain amount of simplification has been built in at the level of FORM code. This is rather slow
and far from perfect. It gives an improvement of a factor between three and five. The whole
reaction produces $\mathcal{O}(10^{?})$ subroutines which, after improvement, use $63 \cdot 10^{?}$ additions and $70 \cdot 10^{?}$
multiplications (the exponents are illegible in this copy). The code can be compiled and made into a single executable, provided we use
double precision. In quadruple precision the executable is too large (larger than 2 Gbytes) and the
relocation mechanism of the GNU system is not up to the task.

Another way would be to compute the amplitude. This has advantages and disadvantages. The 
obvious disadvantage is that we have to deal with spinors and spin orientations. The advantage is a 
better numerical behaviour and an expression that is in principle linear in the number of diagrams. 
A sample input diagram is 

[Figure: the sample one loop diagram; the drawing is not recoverable from the extracted text.]



    *vfb(f10,p2,amel)
    *ffvn(`czel1',`czel2',f10,p2,l8,-l10,m8c)
    *sfn(f10,l8,`amel')
    *ffvn(`cael1',`cael2',f10,-l8,l6,-p3,n2a)
    *sfn(f10,l6,`amel')
    *ffvn(`cael1',`cael2',f10,-l6,p1,k7,m5c)
    *uf(f10,p1,amel)
    *ufb(f11,p4,amel)
    *ffvn(`czel1',`czel2',f11,-p4,-l9,l10,m9c)
    *sfn(f11,-l9,`amel')
    *ffvn(`cael1',`cael2',f11,l9,-p5,-k7,m7c)
    *vf(f11,p5,amel)
    *epsv(n2a,p3,ama)
    *dvn(m7c,m5c,k7,`ama')
    *dvn(m8c,m9c,l10,`amz')
    *num(2500)*loop(5)
    *mom1(q6,+p1)
    *mom1(q8,+p1-p3)
    *mom1(q9,+p5)
    *mom1(q10,+p4+p5)
    *mom2(2,l9,+q9+k7)
    *mom2(3,l10,+q10+k7)
    *mom2(4,l8,+q8+k7)
    *mom2(5,l6,+q6+k7)
    *mom3(k7,Q)

We can see the spinors here. One way to deal with them is the 'spider way', i.e. project them
out onto the S, P, V, A, T currents and use the 10 spider relations to eliminate the tensor currents and
contractions of the V and A currents with Levi-Civita tensors. When we bracket out the spinor
and polarization vector dependent pieces there are 'only' 580 different spin dependent objects
that have to be computed 16 times. This means that we have to compute $\mathcal{O}(10^4)$ spin related
quantities ($580 \times 16 = 9280$), compute 580 scalar expressions and multiply those in. This is all very little compared
to the millions of terms inside those 580 expressions.

Inside these expressions we have the loop integrals. We deal with them the 'Grace way' [30].
We can arrange things in such a way that we have to compute each only once. There are in total 429
different loop integrals with their tensor structures. This is out of a total of 3456 diagrams, of which 3236
have a loop to be computed (the rest have counterterms). In contrast, the matrix element squared
method needs to calculate a loop integral 3236 times.

At the moment we have no system of optimization yet and there are $\mathcal{O}(38 \cdot 10^{?})$ additions and
$\mathcal{O}(280 \cdot 10^{?})$ multiplications (the exponents are illegible in this copy). The fact that there are already fewer additions gives good hope that
after optimization this will be much shorter than the matrix element squared method, as there is typically
about one multiplication per term after optimization.
Example: 

+L97 (0) * ( 

+ 9/16*amel''2*zk'-2*inf *Z_79*Z_78*Z_76*Z_75*Z_74*Z_73'"2*Z_69*Z_21*Z_1 
-15/8*amel''2*zk'^2*inf *Z_7 9*Z_7 8*Z_7 6*Z_75*Z_7 4*Z_73'^2*Z_6 9*Z_21*Z_12 
+ 9/4*amel''2*zk'^2*inf *Z_79*Z_78*Z_76*Z_75*Z_74*Z_73'~2*Z_69*Z_21*Z_13 
-ame l'^2*zk'^2*inf* Z_7 9 * Z_7 8 * Z_7 6 * Z_7 5 * Z_7 4 * Z_7 3 2 * Z_6 9 * Z_2 1 * Z_l 5 
+ 9/16*amel''2*zk'^2*inf *Z_79*Z_78*Z_77*Z_76*Z_75*Z_74*Z_73*Z_69*Z_21*Z_1 
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-15/8*amel''2*zk'~2*inf *Z_7 9*Z_7 8*Z_7 7*Z_7 6*Z_75*Z_74*Z_73*Z_69*Z_21*Z_12 

+ 9/4*amel-^2*zk'"2*inf *Z_79*Z_78*Z_77*Z_76*Z_75*Z_74*Z_73*Z_69*Z_21*Z_13 



-amel'^2*zk' 


^2*inf 


"*Z_7 9*Z_ 


_7 8*Z_ 


_77*Z. 


_76*Z_ 


_75*Z_ 


_74*Z_ 


_73*Z_ 


_69*Z_ 


_2 1 * Z. 


-15 


+9/16*amel' 


^2*zk'- 


2*inf *Z_ 


_80*Z_ 


_78*Z. 


_77*Z_ 


_76*Z_ 


_75*Z_ 


_74*Z_ 


_73*Z_ 


_69*Z. 


_21*Z_1 


-15/8*amel' 


^2*zk-" 


2*inf *Z_ 


_8 0*Z_ 


_7 8*Z_ 


_77*Z_ 


_76*Z_ 


_75*Z_ 


_74*Z_ 


_73*Z_ 


_6 9*Z_ 


_21*Z_12 



+ 9/4*amel-^2*zk'"2*inf *Z_80*Z_78*Z_77*Z_76*Z_75*Z_74*Z_73*Z_69*Z_21*Z_13 
-amel'~2*zk'^2*inf *Z_80*Z_7 8*Z_7 7*Z_7 6*Z_75*Z_7 4*Z_73*Z_69*Z_21*Z_15 
+ 9/16*amel'"2*zk-^2*inf *Z_8 0*Z_7 8*Z_7 7'~2*Z_7 6*Z_7 5*Z_74*Z_69*Z_21*Z_1 
-15/8*amel''2*zk-^2*inf *Z_80*Z_78*Z_77^2*Z_7 6*Z_75*Z_74*Z_69*Z_21*Z_12 
+ 9/4*amel''2*zk''2*inf *Z_80*Z_7 8*Z_7 7'~2*Z_7 6*Z_75*Z_7 4*Z_69*Z_21*Z_13 
-amel-^2*zk'^2*inf *Z_8 0*Z_7 8*Z_7 7-^2*Z_7 6*Z_7 5*Z_7 4*Z_69*Z_21*Z_15 
-9/32*amel''2*Ndim*zk-^2*inf *Z_7 9*Z_7 8*Z_7 6*Z_7 5*Z_7 4*Z_73-^2*Z_69*Z_21*Z_1 
+ 15/16*amel'~2*Ndim*zk'~2*inf *Z_7 9*Z_7 8*Z_7 6*Z_75*Z_7 4*Z_7 3'^2*Z_69*Z_21*Z_12 
-9/8*amel-^2*Ndim*zk-^2*inf *Z_79*Z_78*Z_76*Z_75*Z_74*Z_73''2*Z_69*Z_21*Z_13 
+ l/2*amel-^2*Ndim*zk-^2*inf *Z_7 9*Z_7 8*Z_7 6*Z_7 5*Z_7 4*Z_7 3''2*Z_69*Z_21*Z_15 
-9/32*amel'^2*Ndim*zk'^2*inf *Z_7 9*Z_7 8*Z_77*Z_7 6*Z_75*Z_7 4*Z_73*Z_69*Z_21*Z_1 
+ 15/16*amel'^2*Ndim*zk-^2*inf *Z_79*Z_78*Z_77*Z_76*Z_75*Z_74*Z_73*Z_69*Z_21*Z_12 
-9/8*amel-^2*Ndim*zk-^2*inf *Z_7 9*Z_7 8*Z_7 7*Z_7 6*Z_7 5*Z_7 4*Z_7 3*Z_69*Z_21*Z_13 
+ l/2*amel''2*Ndim*zk'~2*inf *Z_7 9*Z_7 8*Z_7 7*Z_7 6*Z_7 5*Z_7 4*Z_7 3*Z_69*Z_21*Z_15 
-9/32*amel''2*Ndim*zk'^2*inf *Z_80*Z_78*Z_77*Z_76*Z_75*Z_74*Z_73*Z_69*Z_21*Z_1 
+ 15/16*amel-^2*Ndim*zk-^2*inf *Z_8 0*Z_78*Z_77*Z_7 6*Z_75*Z_74*Z_73*Z_69*Z_21*Z_12 
-9/8*amel''2*Ndim*zk'~2*inf *Z_80*Z_7 8*Z_7 7*Z_7 6*Z_7 5*Z_7 4*Z_7 3*Z_69*Z_21*Z_13 
+ l/2*amel'^2*Ndim*zk-^2*inf *Z_80*Z_78*Z_77*Z_76*Z_75*Z_74*Z_73*Z_69*Z_21*Z_15 
-9/32*amel'"2*Ndim*zk^2*inf *Z_80*Z_7 8*Z_77'"2*Z_7 6*Z_75*Z_7 4*Z_69*Z_21*Z_1 
+ 15/16*amel'~2*Ndim*zk'~2*inf *Z_80*Z_7 8*Z_77'~2*Z_7 6*Z_75*Z_7 4*Z_69*Z_21*Z_12 
-9/8*amel-^2*Ndim*zk-^2*inf *Z_80*Z_78*Z_77-^2*Z_76*Z_75*Z_74*Z_69*Z_21*Z_13 
+ l/2*amel-^2*Ndim*zk-^2*inf *Z_80*Z_78*Z_77-^2*Z_76*Z_75*Z_74*Z_59*Z_21*Z_15 
) 

This code has 32 terms and 412 multiplications but it is relatively easy to squeeze it to 

    +L97(0)*zk^2*inf*amel^2*(Ndim-2)*(Z_79+Z_80)
           *Z_78*Z_77*(Z_77+Z_73)*Z_76*Z_75*Z_74*Z_69*Z_21
           *(-9*Z_1+30*Z_12-36*Z_13+16*Z_15)/32

which involves 6 additions and 19 multiplications unless there are subexpressions that are common 
with other code in which case it is even less. 
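The operation counts quoted for the long and the short form can be reproduced by counting operators in the expression text. The Python sketch below rebuilds the 32 terms from their regular structure (four products of Z's, four trailing factors, with and without Ndim) and counts every `*`. For the short form it counts every `*` plus the final division by 32, and every binary `+` or `-`. This bookkeeping convention is an assumption chosen because it reproduces the numbers in the text; the sketch only counts operations and does not test the two forms for algebraic equality.

```python
import re

Z_PRODUCTS = [
    "Z_79*Z_78*Z_76*Z_75*Z_74*Z_73^2*Z_69*Z_21",
    "Z_79*Z_78*Z_77*Z_76*Z_75*Z_74*Z_73*Z_69*Z_21",
    "Z_80*Z_78*Z_77*Z_76*Z_75*Z_74*Z_73*Z_69*Z_21",
    "Z_80*Z_78*Z_77^2*Z_76*Z_75*Z_74*Z_69*Z_21",
]
# (coefficient including sign, trailing factor); "-" means coefficient -1.
PLAIN = [("+9/16*", "Z_1"), ("-15/8*", "Z_12"), ("+9/4*", "Z_13"), ("-", "Z_15")]
WITH_NDIM = [("-9/32*", "Z_1"), ("+15/16*", "Z_12"), ("-9/8*", "Z_13"), ("+1/2*", "Z_15")]

def build_terms():
    """Rebuild the 32 terms of the long form as strings."""
    terms = []
    for prefix, coeffs in (("amel^2*zk^2*inf", PLAIN),
                           ("amel^2*Ndim*zk^2*inf", WITH_NDIM)):
        for zprod in Z_PRODUCTS:
            for coeff, last in coeffs:
                terms.append(f"{coeff}{prefix}*{zprod}*{last}")
    return terms

long_terms = build_terms()
long_mults = sum(term.count("*") for term in long_terms)

short = ("+L97(0)*zk^2*inf*amel^2*(Ndim-2)*(Z_79+Z_80)"
         "*Z_78*Z_77*(Z_77+Z_73)*Z_76*Z_75*Z_74*Z_69*Z_21"
         "*(-9*Z_1+30*Z_12-36*Z_13+16*Z_15)/32")
short_mults = short.count("*") + short.count("/")
# A + or - is binary when it follows an operand or a closing bracket.
short_adds = len(re.findall(r"(?<=[\w)])[+-]", short))

print(len(long_terms), long_mults, short_adds, short_mults)  # 32 412 6 19
```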

The object L97() is a scalar three-point function. The argument indicates which tensor
integral is needed. We manage to store the powers of the various Feynman parameters in a
one-dimensional array in an optimal packing. This makes it easy to compute all loop integrals and their
tensor varieties first and then use them from these arrays. This saves much time and space.

5.1 Intermezzo 

If we have an N-point function, there can be at most N powers of the loop momentum in
the numerator. This means that each Feynman parameter can have up to N powers and there are
$N-1$ Feynman parameters. In an $(N-1)$-dimensional array there would be $(N+1)^{N-1}$ elements but
actually we need only $\frac{(2N-1)!}{N!\,(N-1)!}$ elements. A good mapping of $i_1,\ldots,i_{N-1}$ to a single number $K$ is

$$K_{i_1,\ldots,i_{N-1}} = B(2N-1,N) - 1 + \sum_{j=1}^{N-1} (\ldots)\, B(\ldots, j)$$

(the sign factor and the first argument of $B$ in the summand are illegible in this copy; they depend on $N$, $j$ and $I_j$), with

$$I_j = \sum_{k=1}^{j} i_k$$
$$B(n,m) = \frac{n}{m}\, B(n-1,m-1) \qquad m > 0$$
$$B(n,0) = 1$$

This can be programmed both in the FORM program and the FORTRAN program. In FORM it is 
much more compact. 
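To illustrate what such a packing does, here is a minimal Python sketch. It does not reproduce the formula of the text; it uses a different but standard construction (the combinatorial number system) built from the same ingredients, the partial sums $I_j$ and the binomials $B(n,m)$. The specific rank formula $K = \sum_j B(I_j + j - 1, j)$ and the function names are assumptions of this sketch. The check at the end confirms that for small N all power tuples with total power at most N are mapped one-to-one onto $0, \ldots, B(2N-1,N)-1$.

```python
from itertools import product

def binom(n, m):
    """B(n,m) via B(n,m) = n/m * B(n-1,m-1), B(n,0) = 1 (exact in integers)."""
    if m == 0:
        return 1
    return binom(n - 1, m - 1) * n // m

def pack(powers):
    """Map Feynman parameter powers (i_1, ..., i_{N-1}) to one array offset.

    Sketch only: ranks the tuple with the combinatorial number system,
    K = sum_j B(I_j + j - 1, j), where I_j = i_1 + ... + i_j.
    """
    partial_sum = 0
    offset = 0
    for j, power in enumerate(powers, start=1):
        partial_sum += power
        offset += binom(partial_sum + j - 1, j)
    return offset

def is_dense_packing(N):
    """True if all tuples with total power <= N fill 0..B(2N-1,N)-1 exactly."""
    tuples = [t for t in product(range(N + 1), repeat=N - 1) if sum(t) <= N]
    offsets = sorted(pack(t) for t in tuples)
    return offsets == list(range(binom(2 * N - 1, N)))

for N in (2, 3, 4, 5):
    print(N, binom(2 * N - 1, N), (N + 1) ** (N - 1), is_dense_packing(N))
```

For N = 5 the packed array has 126 elements, against 1296 for the full four-dimensional array.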

It should be clear now that code optimization is of dominant importance. In the above
example a simple factorization would suffice, but unfortunately that is usually not the case. We need
techniques as used in compilers, but we have extra liberties. In a compiler one is not allowed to
assume the addition to be associative or commutative. Here we can.

Of course the above comparisons are not completely fair. We have put the amplitude as a single
expression in FORM, while the matrix element squared method worked diagram by diagram. We
are, however, not yet able to put the latter into FORM as a single expression. In the end the
expression might be comparable in size, but especially in the early stages it would be much larger.
There are other complications concerning D-dimensional indices versus 4-dimensional indices,
because now the 4-dimensional indices can arrange themselves into loop-like structures and one has
to keep them unsummed in the beginning at great cost. This is all much easier with the amplitudes,
as the only indices that can occur as in $\delta_{aa}$ are the loop indices. Everything outside the loop can
be taken 4-dimensional immediately.

At the moment work on code improvement and factorization is in an advanced stage, but not 
yet near completion. It will be interesting to see how much the expressions can be squeezed. 

6. Open Source 

Starting 26 Aug 2010 FORM has become open source. This means that there is a web based
CVS from which anybody can download the sources of FORM and TFORM. There are some tools
for configuration, but because we have access only to a limited number of computers this is far from
complete. Our hope is that users can make contributions here.

The license is the GNU Public License with the added hope that people will refer to the FORM
publication when they use FORM for scientific publications.

The reason behind this move is that in a number of years, we do not know how many, FORM 
will have to survive without its original author. For this it is important that more people familiarize 
themselves with the sources and make additions. This can eventually only be done when the sources 
are generally available. Even so, it is not that easy to make additions to FORM because the code 
is more than 3.2 Mbytes (currently) (118000 lines) and not all of it is extensively documented. 
But there does exist much documentation if one compares it with similar programs. There is a 
testsuite based on the Ruby system, a layout program based on doxygen and of course there are 
lots of LaTeX files with explanations. For some program segments there is much commentary and 
for some (mostly older) segments there is unfortunately not very much commentary. Occasionally 
commentary is added, especially after a difficult debugging session. 

Most of the work related to making FORM open source has been done by Jens Vollinga. This
is fully in line with having more and more people involved with the development. The current
drawback is that he will be leaving the academic environment. This may mean that he cannot
spend much more time on FORM development (and GiNaC development).
Currently several people are working on new pieces of FORM code.

• Misha Tentyukov makes occasional additions as needed in Karlsruhe. 

• Jens Vollinga has made additions like systems independent .sav files. 

• Irina Pushkina works on code improvement for FORTRAN and/or C code. 

• Jan Kuipers works on rational polynomials, including factorization. If time is left in his
contract he may create some facilities for Gröbner bases.

• Thomas Reiter has put in most of the FORTRAN90 output mode. 

In addition there are people who are very active in testing out new features and producing good 
bug reports. The importance of this should not be underestimated. 



7. The Forum 



To aid in dispersed development we (Jens Vollinga mainly) have set up a forum that allows
people to communicate with each other. In principle this can be done without involvement of any
of the main developers although, just in case, there will be moderators to remove inappropriate
messages (such as spam) should they occur.

The forum is located at http://www.nikhef.nl/~form/forum and anybody can read it. To
post messages you have to be a member. Subscription is rather easy.

For seeing how it works it is best to visit the site. 

8. Conclusions 

FORM development is slow work, but at the same time it makes steady progress. 
Hopes are that the open source policy will add more impetus to this development. 
Several projects are under way that will make outputs more compact. 

References 



[1] J. A. M. Vermaseren, New features of FORM, math-ph/0010025

[2] S. G. Gorishnii, S. A. Larin, L. R. Surguladze, and F. V. Tkachev, Comp. Phys. Comm. 55 (1989) 381

[3] S. A. Larin, F. V. Tkachev, and J. A. M. Vermaseren, NIKHEF-H-91-18

[4] T. van Ritbergen, J.A.M. Vermaseren and S.A. Larin, Phys. Lett. B400 (1997) 379, hep-ph/9701390;
M. Czakon, Nucl. Phys. B710 (2005) 485, hep-ph/0411261

[5] T. van Ritbergen, A.N. Schellekens and J.A.M. Vermaseren, Int. J. Mod. Phys. A14 (1999) 41, 
hep-ph/9802376 

[6] J.A.M. Vermaseren, Int. J. Mod. Phys. A14 (1999) 2037, hep-ph/9806280 






FORM development 



J.A.M.Vermaseren 



[7] D. Fliegner, A. Retey and J. A. M. Vermaseren, arXiv:hep-ph/9906426. 

[8] D. Fliegner, A. Retey and J. A. M. Vermaseren, arXiv:hep-ph/0007221. 

[9] M. Tentyukov, D. Fliegner, M. Frank, A. Onischenko, A. Retey, H. M. Staudenmaier and 
J. A. M. Vermaseren, arXiv:cs.sc/0407066. 

[10] Some recent results are found in:
P. A. Baikov and K. G. Chetyrkin, arXiv:hep-ph/0604194; P. A. Baikov, K. G. Chetyrkin and
J. H. Kühn, Nucl. Phys. Proc. Suppl. 157 (2006) 27 [arXiv:hep-ph/0602126].

[11] S. Laporta, Int. J. Mod. Phys. A15 (2000) 5087; hep-ph/0102033

[12] J. Blümlein, D. J. Broadhurst and J. A. M. Vermaseren, Comput. Phys. Commun. 181 (2010) 582
[arXiv:0907.2557 [math-ph]].

[13] M. Tentyukov and J. A. M. Vermaseren, Comput. Phys. Commun. 181 (2010) 1419 
[arXiv:hep-ph/0702279]. 

[14] D. I. Kazakov and A. V. Kotikov, Nucl. Phys. B307, 721 (1988); ibid. B345, 299 (1990). 

[15] J.A.M. Vermaseren, Tuning FORM with large calculations, Nucl. Phys. Proc. Suppl. 116 (2003) 343-347.

[16] S. Moch, J. A. M. Vermaseren and A. Vogt, Nucl. Phys. B 688 (2004) 101 [arXiv:hep-ph/0403192].

[17] A. Vogt, S. Moch and J. A. M. Vermaseren, Nucl. Phys. B 691 (2004) 129 [arXiv:hep-ph/0404111].

[18] J. A. M. Vermaseren, A. Vogt and S. Moch, Nucl. Phys. B 724 (2005) 3 [arXiv:hep-ph/0504242]. 

[19] L. Euler, Meditationes circa singulare serium genus, Novi Comm. Acad. Sci. Petropol. 20 (1775) 
140-186, reprinted in Opera Omnia ser I vol. 15, (B.G. Teubner, Berlin, 1927), 217-267. 

[20] J. A. M. Vermaseren, Harmonic sums, Mellin transforms and integrals. Int. J. Mod. Phys. A 14 (1999) 
2037-2076, [arXiv:hep-ph/9806280] . 

[21] J. Blümlein and S. Kurth, Harmonic sums and Mellin transforms up to two-loop order, Phys. Rev. D
60 (1999) 014018, 31 p., [arXiv:hep-ph/9810241].

[22] E. Remiddi and J. A. M. Vermaseren, Harmonic polylogarithms. Int. J. Mod. Phys. A 15 (2000) 

725-754, [arXiv:hep-ph/9905237] . 

[23] D. J. Broadhurst, On the enumeration of irreducible k-fold Euler sums and their roles in knot theory
and field theory, arXiv:hep-th/9604128.

[24] D. J. Broadhurst and D. Kreimer, Association of multiple zeta values with positive knots via Feynman
diagrams up to 9 loops, Phys. Lett. B 393 (1997) 403-412, [arXiv:hep-th/9609128].

[25] J. Kuipers and J. A. M. Vermaseren. In preparation.

[26] J. A. M. Vermaseren, Nucl. Phys. Proc. Suppl. 205-206 (2010) 104 [arXiv:1006.4512 [hep-ph]].

[27] G. J. van Oldenborgh and J. A. M. Vermaseren, Z. Phys. C 46 (1990) 425. 

[28] G. J. van Oldenborgh, Comput. Phys. Commun. 66 (1991) 1. 

[29] F. Yuasa et al., Prog. Theor. Phys. Suppl. 138 (2000) 18.

[30] J. Fujimoto et al., Nucl. Phys. Proc. Suppl. 160 (2006) 150-154. 






